A learning framework for the optimization and automation of document binarization methods

نویسندگان

  • Mohamed Cheriet
  • Reza Farrahi Moghaddam
  • Rachid Hedjam
چکیده

Almost all binarization methods have a few parameters that require setting. However, they do not usually achieve their upper-bound performance unless the parameters are individually set and optimized for each input document image. In this work, a learning framework for the optimization of the binarization methods is introduced, which is designed to determine the optimal parameter values for a document image. The framework, which works with any binarization method, has a standard structure, and performs three main steps: i) extracts features, ii) estimates optimal parameters, and iii) learns the relationship between features and optimal parameters. First, an approach is proposed to generate numerical feature vectors from 2D data. The statistics of various maps are extracted and then combined into a final feature vector, in a nonlinear way. The optimal behavior is learned using support vector regression (SVR). Although the framework works with any binarization method, two methods are considered as typical examples in this work: the grid-based Sauvola method, and Lu’s method, which placed first in the Preprint submitted to Elsevier November 20, 2012 DIBCO’09 contest. The experiments are performed on the DIBCO’09 and H-DIBCO’10 datasets, and combinations of these datasets with promising results.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A learning framework for automation and optimization of document binarization methods (The supplementary material)

B The multilevel maps 4 B.1 Color maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 B.2 Stroke map (SM) and gray stroke map (GSM) . . . . . . . . . 5 B.3 Stroke gray map (SGM) . . . . . . . . . . . . . . . . . . . . . 8 B.4 Stroke gray level (SGL) . . . . . . . . . . . . . . . . . . . . . . 8 B.5 Background gray level (BGL) . . . . . . . . . . . . . . . . . . 10 B.6 Estimated backg...

متن کامل

Learning Document Image Features With SqueezeNet Convolutional Neural Network

The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...

متن کامل

Automatic road crack detection and classification using image processing techniques, machine learning and integrated models in urban areas: A novel image binarization technique

The quality of the road pavement has always been one of the major concerns for governments around the world. Cracks in the asphalt are one of the most common road tensions that generally threaten the safety of roads and highways. In recent years, automated inspection methods such as image and video processing have been considered due to the high cost and error of manual metho...

متن کامل

A Practical Self-Assessment Framework for Evaluation of Maintenance Management System based on RAMS Model and Maintenance Standards

A set of technical, administrative and management activities are done in the life cycle of equipment, to be located in good condition and have proper and expected functioning. This is refers to be, maintenance management system (MMS). The framework and models of assessment in order to enhance effectiveness of a MMS could be proposed in two categories: qualitative and quantitative. In this resea...

متن کامل

Learning To Binarize Document Images

Document images produced by cameras often have varying degrees of brightness. To resolve the problem, we propose a method that divides an image into several regions and decides what binarization action to take on each region based on the rules that are derived from a learning process. Since each region can allow more than one action to take, we are dealing with a multi-label and multi-class cla...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Computer Vision and Image Understanding

دوره 117  شماره 

صفحات  -

تاریخ انتشار 2013